Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: support for requesting chunks without end offset #2391

Merged

Conversation

giuseppe
Copy link
Member

@giuseppe giuseppe commented May 2, 2024

if the specified length for the range is set to -1, then request all the data possible by using the "Range: =-" syntax for the HTTP range.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

  • How this is happening? I can’t see any c/storage caller setting this to -1 .
  • The Length == MaxUint64 semantics is, AFAICS, not documented, and at least mergeMissingChunks is doing ordinary arithmetic on the values.
  • If this should exist at all, blobChunkAccessorProxy, splitHTTP200ResponseToPartial, streamChunksFromFile do not handle the special case.

@giuseppe
Copy link
Member Author

giuseppe commented May 2, 2024

the error I am seeing is that we pass the value -1 to

  • How this is happening? I can’t see any c/storage caller setting this to -1 .

the problem is here https://github.com/containers/storage/blob/main/pkg/chunked/storage_linux.go#L1611-L1617

in some cases the blobSize is set to -1 from c/image, e.g. when pulling quay.io/libpod/alpine-with-seccomp:label.

I could change the code to not use any range (if there are no ranges, then request the whole file), but I thought this would be more flexible. We can restrict it so that only the last chunk can have len=-1

@mtrmac
Copy link
Collaborator

mtrmac commented May 2, 2024

OK, so this is

  • a schema1 image (unknown layer size)
  • the transparent conversion path (needs to read the full file)
  • ImageSourceSeekable not having a “read all” method at all.

Pedantically, when the size is unknown, while GetDiffer requires it (all three implementations currently do, and there is no “unknown” value documented, the locally correct thing to do is for PutBlobPartial to fail without even trying.

If you do want the conversion path to work on these images… yes, I think adding the “length unknown” case to ImageSourceSeekable / BlobChunkAccessor is the simplest approach. (Notably blobChunkAccessorProxy is simpler when there is only one API to call, having to add a whole separate “no-partial download” case to that would not be worth it). So, document the field value, allow it only for the last chunk.


But, at a higher level, this is really another argument to say that the transparent conversion should not be via PutBlobPartial/GetDiffer at all: let it work like an ordinary non-chunked pull, letting c/image deal with the whole blob and digest verification, and then do the conversion all the way down in overlay’s ApplyDiff.

@giuseppe
Copy link
Member Author

giuseppe commented May 2, 2024

OK, so this is

  • a schema1 image (unknown layer size)
  • the transparent conversion path (needs to read the full file)
  • ImageSourceSeekable not having a “read all” method at all.

Pedantically, when the size is unknown, while GetDiffer requires it (all three implementations currently do, and there is no “unknown” value documented, the locally correct thing to do is for PutBlobPartial to fail without even trying.

If you do want the conversion path to work on these images… yes, I think adding the “length unknown” case to ImageSourceSeekable / BlobChunkAccessor is the simplest approach. (Notably blobChunkAccessorProxy is simpler when there is only one API to call, having to add a whole separate “no-partial download” case to that would not be worth it). So, document the field value, allow it only for the last chunk.

But, at a higher level, this is really another argument to say that the transparent conversion should not be via PutBlobPartial/GetDiffer at all: let it work like an ordinary non-chunked pull, letting c/image deal with the whole blob and digest verification, and then do the conversion all the way down in overlay’s ApplyDiff.

we need to be able to pull via the PutBlobPartial when convert_images is enabled. composefs needs it to calculate the digests and enable fs-verity for each file.

@mtrmac
Copy link
Collaborator

mtrmac commented May 2, 2024

As far as c/image is concerned, with UncompressedDigest known, the layer’s singleLayerIDComponent is going to be the DiffID, and the ComposeFS usage is ~invisible to c/image. I naively imagine that whether the call stack says GetDiffer or ApplyDiff does not really matter for fs-verity and other work.


One change that does make a difference, however, is that GetDiffer is parallelized over up to 6 layers concurrently, but layer commit is sequential. That might well be a very strong reason to do the conversion in GetDiffer, not in the commit phase, I didn’t try to measure that.

If this does help, arguably the layer extraction should be similarly made concurrent for non-chunked/non-ComposeFS layers as well. But that redesign would really be a very different conversation (and we might end up deprecating non-ComposeFS layers before that).

@giuseppe giuseppe force-pushed the docker-image-support-missing-end-range branch 2 times, most recently from 679afe4 to e09da45 Compare May 3, 2024 09:14
@giuseppe
Copy link
Member Author

giuseppe commented May 3, 2024

I've tried adding a new method GetBlob to retrieve the entire blob, but I think the current API is more flexible, with a minimal change.

I've updated the comments

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The existing code LGTM, but other parts need updating as well.

copy/progress_bars.go Show resolved Hide resolved
pkg/blobcache/src.go Show resolved Hide resolved
internal/private/private.go Show resolved Hide resolved
docker/docker_image_src.go Outdated Show resolved Hide resolved
@@ -255,9 +256,15 @@ func splitHTTP200ResponseToPartial(streams chan io.ReadCloser, errs chan error,
}
currentOffset += toSkip
}
var reader io.Reader
if c.Length == math.MaxUint64 {
reader = body
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking:

Suggested change
reader = body
reader = body // The caller has ensured that this is the last requested chunk

@giuseppe giuseppe force-pushed the docker-image-support-missing-end-range branch from e09da45 to 098926a Compare May 3, 2024 20:05
@giuseppe
Copy link
Member Author

giuseppe commented May 3, 2024

thanks, addressed your comments in the last version

@mtrmac mtrmac added the kind/feature A request for, or a PR adding, new functionality label May 13, 2024
@rhatdan
Copy link
Member

rhatdan commented May 13, 2024

Seems like there is one comment by @mtrmac that was not addressed?

@giuseppe
Copy link
Member Author

Seems like there is one comment by @mtrmac that was not addressed?

I thought I've addressed them all, which one is still missing?

@rhatdan
Copy link
Member

rhatdan commented May 13, 2024

if the specified length for the range is set to -1, then request all
the data possible by using the "Range: <unit>=<range-start>-" syntax
for the HTTP range.

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe giuseppe force-pushed the docker-image-support-missing-end-range branch from 098926a to b47707f Compare May 14, 2024 08:37
@giuseppe
Copy link
Member Author

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mtrmac mtrmac merged commit 984c624 into containers:main May 14, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature A request for, or a PR adding, new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants